METHODS OF USING A MYCOBACTERIUM TUBERCULOSIS
CODING SEQUENCE TO FACILITATE STABLE AND HIGH YIELD 
EXPRESSION OF HETEROLOGOUS PROTEINS 

CROSS-REFERENCES TO RELATED APPLICATIONS 
This application claims priority to provisional application U.S.S.N. 
60/158,585, filed October 7, 1999, the disclosure of which is herein incorporated by 
reference in its entirety. 

TECHNICAL FIELD 
The present invention relates generally to nucleic acid and amino acid 
sequences of a fusion polypeptide comprising a Mycobacterium tuberculosis polypeptide, 
and a heterologous polypeptide of interest, expression vectors and host cells comprising 
such nucleic acids, and methods for producing such fusion polypeptides. In particular, 
the invention relates to materials and methods of using such an M. tuberculosis sequence as
a fusion partner to facilitate the stable and high yield expression of recombinant 
heterologous polypeptides of both eukaryotic and prokaryotic origin. 

BACKGROUND OF THE INVENTION 
The advent of recombinant DNA technology has led to the molecular 
cloning of a large number of coding sequences or genes from diverse cell types. In order 
to study the function of these genes or to produce the products encoded by such 
sequences, these genes are inserted in expression vectors under the control of appropriate 
regulatory sequences. This transfer of the expression vector into a eukaryotic or 
prokaryotic host cell generally results in the expression of the encoded product which can 
be subsequently purified. Large-scale production of many gene products is particularly 
important in cases where such products are of medical or industrial value. 

However, notwithstanding the advances in gene expression, certain coding 
sequences do not readily produce their products in stable form. For example, expression 
in E. coli of recombinant proteins could be problematic particularly for proteins with 
trans-membrane domains or extensive hydrophobic sequences. Moreover, recombinant 
proteins may not contain the N-terminal amino acid residues with the appropriate codon 
bias. Thus, there remains a need for improved materials and methods for the expression 
of recombinant proteins. 



SUMMARY OF THE INVENTION 
The present invention provides for the first time recombinant nucleic acid 
molecules that encode fusion polypeptides comprising a Ral2 polypeptide and a 
heterologous polypeptide, fusion polypeptides, expression vectors and host cells 
comprising the nucleic acid molecules. The present invention further provides methods 
of using such recombinant nucleic acid molecules, expression vectors, and host cells to 
produce stable and high yield expression of fusion polypeptides of interest. 

In one aspect, the present invention provides recombinant nucleic acid 
molecules that encode a fusion polypeptide, the recombinant nucleic acid molecules 
comprising a Ral2 polynucleotide sequence and a heterologous polynucleotide sequence, 
wherein the Ral2 polynucleotide sequence hybridizes to SEQ ID NO:3 under stringent 
conditions. In one embodiment, the recombinant nucleic acid molecules comprise a Ral2 
polynucleotide sequence which is located 5' to a heterologous polynucleotide sequence. 
In another embodiment, the recombinant nucleic acid molecules further comprise a 
polynucleotide sequence that encodes a linker peptide between the Ral2 polynucleotide 
sequence and the heterologous polynucleotide sequence, wherein the linker peptide may 
comprise a cleavage site. In yet another embodiment, the recombinant nucleic acid 
molecules encode fusion polypeptides which further comprise an affinity tag. In yet 
another embodiment, the recombinant nucleic acid molecules encode a fusion polypeptide 
comprising a DPPD, a WT1, a mammaglobin, or a H9-32A heterologous polypeptide. In 
yet another embodiment, the recombinant nucleic acid molecules comprise a Ral2 
polynucleotide sequence comprising at least about 30 nucleotides, at least about 60 
nucleotides, or at least about 100 nucleotides. In yet another embodiment, the 
recombinant nucleic acid molecules comprise a Ral2 polynucleotide sequence as shown 
in SEQ ID NO:3. In yet another embodiment, the recombinant nucleic acid molecules 
comprise a Ral2 polynucleotide sequence that encodes a Ral2 polypeptide as shown
in SEQ ID NO:4, SEQ ID NO:17 or SEQ ID NO:18.

In another aspect, the present invention provides expression vectors 
comprising a promoter operably linked to a recombinant nucleic acid molecule according 
to any one of embodiments described herein. 

In yet another aspect, the present invention provides host cells comprising 
expression vectors according to any one of embodiments described herein. In a preferred 
embodiment, the host cell is E. coli. 



In yet another aspect, the present invention provides fusion polypeptides 
comprising a Ral2 polypeptide and a heterologous polypeptide, wherein the Ral2 
polypeptide is encoded by a Ral2 polynucleotide sequence that hybridizes to SEQ ID 
NO:3 under stringent hybridization conditions. In one embodiment, the Ral2 polypeptide 
comprises at least about 10 amino acids, at least about 30 amino acids, or at least about 
100 amino acids. In another embodiment, the Ral2 polypeptide has a sequence as shown 
in SEQ ID NO:4, SEQ ID NO: 17, or SEQ ID NO: 18. 

In yet another aspect, the present invention provides methods of producing 
fusion polypeptides, the method comprising expressing in a host cell a recombinant 
nucleic acid molecule that encodes a fusion polypeptide, the fusion polypeptide 
comprising a Ral2 polypeptide and a heterologous polypeptide, wherein the Ral2 
polypeptide is encoded by a Ral2 polynucleotide sequence that hybridizes to SEQ ID 
NO:3 under stringent conditions. In one embodiment, the method further comprises 
purifying fusion polypeptides after their expression. In another embodiment, the method 
further comprises cleaving a fusion polypeptide between a Ral2 polypeptide and a 
heterologous polypeptide. 

These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates a nucleotide sequence (SEQ ID NO:1) and an amino
acid sequence (SEQ ID NO:2) of MTB32A. 

Figure 2 illustrates a nucleotide sequence (SEQ ID NO:3) and an amino 
acid sequence (SEQ ID NO:4) of Ral2. 

Figure 3 illustrates a recombinant nucleic acid sequence comprising a 
nucleotide sequence (SEQ ID NO:5) and an amino acid sequence (SEQ ID NO:6) of 
Ral2-DPPD fusion polypeptide. 

Figure 4 illustrates a recombinant nucleic acid sequence comprising a 
nucleotide sequence (SEQ ID NO:7) and an amino acid sequence (SEQ ID NO:8) of 
Ral2-WT1 fusion polypeptide. 

Figure 5 illustrates a recombinant nucleic acid sequence comprising a
nucleotide sequence (SEQ ID NO:9) and an amino acid sequence (SEQ ID NO:10) of
Ral2-mammaglobin fusion polypeptide.

Figure 6 illustrates a recombinant nucleic acid sequence comprising a
nucleotide sequence (SEQ ID NO:11) and an amino acid sequence (SEQ ID NO:12) of
Ral2-H9-32A fusion polypeptide.

Figure 7 illustrates Ral2(short) polypeptide (SEQ ID NO:17), which has
amino acids 1-30 of SEQ ID NO:4.

Figure 8 illustrates Ral2(long) polypeptide (SEQ ID NO:18), which has
amino acids 1-128 of SEQ ID NO:4. 

Figure 9 illustrates a construct of Ral2 (short) polynucleotide fused to a 
human mammaglobin gene. 

DETAILED DESCRIPTION OF THE INVENTION 
As noted above, the present invention provides for the first time 
recombinant nucleic acid molecules, expression vectors, host cells, fusion polypeptides, 
and methods for producing fusion polypeptides, using a Mycobacterium tuberculosis
coding sequence, namely a Ral2 nucleic acid which is a subsequence of a MTB32A 
nucleic acid. In particular, the invention provides materials and methods for using Ral2 
sequences as a fusion partner to facilitate the stable and high yield expression of 
recombinant heterologous polypeptides of both eukaryotic and prokaryotic origin. 

MTB32A is a serine protease of 32 KD molecular weight encoded by a 
gene in virulent and avirulent strains of M. tuberculosis. The complete nucleotide 
sequence (SEQ ID NO:1) and amino acid sequence (SEQ ID NO:2) of MTB32A are
disclosed in Figure 1. See also Skeiky et al., Infection and Immun. (1999) 67:3998-4007,
incorporated herein by reference. This protein is naturally secreted into the
supernatant of bacterial cultures. The open reading frame of the coding sequence 
contains N-terminal hydrophobic secretory signals. It stimulates peripheral blood 
mononuclear cells from healthy purified protein derivative (PPD)-positive donors to 
proliferate and secrete interferon. Thus, MTB32A is a candidate antigen for use in 
vaccine development against tuberculosis. 

Surprisingly, it was discovered by the present inventors that a 14 KD C- 
terminal fragment of the MTB32A coding sequence expresses at high levels on its own 
and remains as a soluble protein throughout the purification process. This 14 KD
C-terminal fragment of the MTB32A is referred to herein as Ral2 (having amino acid
residues 192 to 323 of MTB32A). The nucleic acid and amino acid sequences of native Ral2
are shown, e.g., in Figures 2-6. As described in detail below, the term "Ral2 polypeptide" or
"Ral2 polynucleotide" as used herein refers to the native Ral2 sequences (e.g., SEQ ID
NO:3 or SEQ ID NO:4), their variants, or fragments thereof (e.g., SEQ ID NO:17 or SEQ
ID NO:18). The present invention utilizes these properties of Ral2 polypeptides and
provides recombinant nucleic acid molecules, expression vectors, host cells, and methods
for stable and high yield expression of fusion polypeptides comprising a Ral2
polypeptide and a heterologous polypeptide of interest. The materials and methods of the
present invention are particularly useful in expressing certain heterologous polypeptides 
(e.g., DPPD) that other conventional expression methods failed to express in any 
substantial quantity. 
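
As a rough consistency check on the fragment described above, the residue span 192 to 323 can be converted to a length and an approximate mass. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this disclosure; the function name is illustrative only.

```python
# Rough size check for the Ral2 fragment (residues 192-323 of MTB32A).
# ASSUMPTION: an average residue mass of ~110 Da (rule of thumb, not from the disclosure).
AVG_RESIDUE_MASS_DA = 110.0


def fragment_size(first_residue: int, last_residue: int) -> tuple[int, float]:
    """Return (number of residues, approximate mass in kD) for an inclusive span."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("expected 1 <= first_residue <= last_residue")
    n_residues = last_residue - first_residue + 1  # inclusive numbering
    mass_kd = n_residues * AVG_RESIDUE_MASS_DA / 1000.0
    return n_residues, mass_kd


n_residues, mass_kd = fragment_size(192, 323)
print(n_residues, round(mass_kd, 1))  # 132 residues, about 14.5 kD
```

Under that assumption the span corresponds to 132 residues and roughly 14.5 kD, in line with the 14 KD figure given for the fragment.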

Recombinant Fusion Nucleic Acids 

Recombinant nucleic acids, which encode a fusion polypeptide comprising
a Ral2 polypeptide and a heterologous polypeptide of interest, can be readily constructed
by conventional genetic engineering techniques. Recombinant nucleic acids are
constructed so that, preferably, a Ral2 polynucleotide sequence is located 5' to a selected
heterologous polynucleotide sequence. It may also be appropriate to place a Ral2
polynucleotide sequence 3' to a selected heterologous polynucleotide sequence or to
insert a heterologous polynucleotide sequence into a site within a Ral2 polynucleotide
sequence.

In the present invention, any suitable heterologous polynucleotide of
interest can be selected as a fusion partner to Ral2 nucleic acids to produce a fusion
polypeptide. A "heterologous sequence" or a "heterologous nucleic acid," as used herein,
is one that originates from a source foreign to the particular host cell, or, if from the same
source, is modified from its original form. Thus, a heterologous nucleic acid in a
prokaryotic host cell includes a nucleic acid that is endogenous to the particular
host cell but has been modified. Modification of the heterologous sequence may occur,
e.g., by treating the DNA with a restriction enzyme to generate a DNA fragment that is
capable of being operably linked to the promoter. Techniques such as site-directed
mutagenesis are also useful in modifying a heterologous sequence.

A heterologous nucleic acid from both eukaryotic and prokaryotic origins
can be selected as a fusion partner. These nucleic acids include, but are not limited to,
nucleic acids that encode pathogenic antigens, bacterial antigens, viral antigens, cancer
antigens, tumor antigens, and tumor suppressors. Exemplary heterologous nucleic acids
of interest include DPPD, WT1, mammaglobin, H9-32A nucleic acids, and other
Mycobacterium tuberculosis nucleic acids (see, e.g., Cole et al. Nature (1998) 393:537-544;
http://www.sanger.ac.uk; and http://www.pasteur.fr/mycdb/ for the complete
genome sequences of M. tuberculosis; see also WO98/53075 and WO98/53076, both of
which were published on November 26, 1998, for nucleic acid sequences that encode M.
tuberculosis proteins). Any one of the nucleic acids disclosed herein can be used alone or
in combination as a heterologous nucleic acid that can be selected as a fusion partner.

In addition, any suitable Ral2 polynucleotide (e.g., native Ral2
polynucleotide having SEQ ID NO:3, variants or fragments thereof) can be used in
constructing recombinant fusion nucleic acids of the present invention. Preferred Ral2
polynucleotides comprise at least about 15 consecutive nucleotides, at least about 30
nucleotides, at least about 60 nucleotides, at least about 100 nucleotides, at least about
200 nucleotides, or at least about 300 nucleotides. Polynucleotides may be
single-stranded or double-stranded, and may be DNA (genomic, cDNA or synthetic) or RNA
molecules.

In one embodiment, the Ral2 polynucleotide sequence is as shown in SEQ
ID NO:3. In another embodiment, the Ral2 polynucleotide sequence encodes a Ral2
polypeptide as shown in SEQ ID NO:4. In some embodiments, the Ral2 polynucleotide
sequence comprises a portion of SEQ ID NO:3 or encodes a portion of SEQ ID NO:4.
For instance, a Ral2 polynucleotide comprising 90 nucleotides (e.g., nucleotides 1-90 of
SEQ ID NO:3), or a Ral2 polynucleotide comprising 384 nucleotides (e.g., nucleotides
1-384 of SEQ ID NO:3) can be used as a fusion partner. See Examples 2 and 3 below.
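
The relationship between these nucleotide fragments and the Ral2(short) and Ral2(long) polypeptides of Figures 7 and 8 is simple codon arithmetic. A minimal sketch follows; the helper name is hypothetical.

```python
def encoded_residues(first_nt: int, last_nt: int) -> int:
    """Number of complete codons in an inclusive, in-frame nucleotide span."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("span must be a positive multiple of three nucleotides")
    return length // 3


print(encoded_residues(1, 90))   # 30, matching amino acids 1-30 (Ral2 short)
print(encoded_residues(1, 384))  # 128, matching amino acids 1-128 (Ral2 long)
```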

Polynucleotides may comprise a native sequence (i.e., an endogenous
sequence that encodes a Ral2 polypeptide SEQ ID NO:3 or a portion thereof) or may
comprise a variant of such a sequence. Polynucleotide variants may contain one or more
substitutions, additions, deletions and/or insertions such that the biological activity of the
encoded fusion polypeptide is not diminished, relative to a fusion polypeptide comprising
a native Ral2 polypeptide. Variants preferably exhibit at least about 70% identity, more
preferably at least about 80% identity and most preferably at least about 90% identity to a
polynucleotide sequence that encodes a native Ral2 polypeptide (SEQ ID NO:4) or a
portion thereof. Optionally, the identity exists over a region that is at least about 25 to
about 50 amino acids or nucleotides in length, or optionally over a region that is 75-100
amino acids or nucleotides in length.

Two polynucleotide or polypeptide sequences are said to be "identical" if
the sequence of nucleotides or amino acids in the two sequences is the same when aligned
for maximum correspondence as described below. Comparisons between two sequences
are typically performed by comparing the sequences over a comparison window to
identify and compare local regions of sequence similarity. A "comparison window" as
used herein, refers to a segment of at least about 20 contiguous positions, usually 30 to
about 75, 40 to about 50, in which a sequence may be compared to a reference sequence
of the same number of contiguous positions after the two sequences are optimally aligned.

Optimal alignment of sequences for comparison may be conducted using
the Megalign program in the Lasergene suite of bioinformatics software (DNASTAR,
Inc., Madison, WI), using default parameters. This program embodies several alignment
schemes described in the following references: Dayhoff, M.O. (1978) A model of
evolutionary change in proteins - Matrices for detecting distant relationships. In Dayhoff,
M.O. (ed.) Atlas of Protein Sequence and Structure, National Biomedical Research
Foundation, Washington DC Vol. 5, Suppl. 3, pp. 345-358; Hein J. (1990) Unified
Approach to Alignment and Phylogenies pp. 626-645 Methods in Enzymology vol. 183,
Academic Press, Inc., San Diego, CA; Higgins, D.G. and Sharp, P.M. (1989) CABIOS
5:151-153; Myers, E.W. and Muller W. (1988) CABIOS 4:11-17; Robinson, E.D. (1971)
Comb. Theor 11:105; Saitou, N. and Nei, M. (1987) Mol. Biol. Evol. 4:406-425; Sneath,
P.H.A. and Sokal, R.R. (1973) Numerical Taxonomy - the Principles and Practice of
Numerical Taxonomy, Freeman Press, San Francisco, CA; Wilbur, W.J. and Lipman, D.J.
(1983) Proc. Natl. Acad. Sci. USA 80:726-730.

Alternatively, optimal alignment of sequences for comparison may be
conducted by the local identity algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the identity alignment algorithm of Needleman and Wunsch (1970) J. Mol.
Biol. 48:443, by the search for similarity methods of Pearson and Lipman (1988) Proc.
Natl. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms
(GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software
Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, WI), or by
inspection.

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al. (1997) Nucl. Acids Res. 25:3389-3402 and Altschul
et al. (1990) J. Mol. Biol. 215:403-410, respectively. BLAST and BLAST 2.0 can be
used, for example with the parameters described herein, to determine percent sequence
identity for the polynucleotides and polypeptides of the invention. Software for
performing BLAST analyses is publicly available through the National Center for
Biotechnology Information. For amino acid sequences, a scoring matrix can be used to
calculate the cumulative score. Extension of the word hits in each direction is halted
when: the cumulative alignment score falls off by the quantity X from its maximum
achieved value; the cumulative score goes to zero or below, due to the accumulation of
one or more negative-scoring residue alignments; or the end of either sequence is reached.
The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the
alignment.
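
The three halting conditions can be illustrated with a simplified, ungapped extension in one direction. This is a teaching sketch under stated assumptions, not the NCBI implementation: the match and mismatch scores (+5/-4), the default drop-off X, the toy sequences, and the function name are all illustrative choices.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 seed_score: int, x_drop: int = 20,
                 match: int = 5, mismatch: int = -4) -> tuple[int, int]:
    """Extend a seed hit to the right without gaps.

    q_start and s_start are the first positions after the seed. Returns
    (best_score, extension_length), where extension_length is the number of
    positions beyond the seed that are kept.
    """
    score = best = seed_score
    best_len = 0
    i = 0
    # Halting condition: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += match if query[q_start + i] == subject[s_start + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        # Halting conditions: the score fell off by X from its maximum
        # achieved value, or the cumulative score went to zero or below.
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len


# Hypothetical 4-letter seed (score 20) followed by 4 matches, then mismatches.
print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 4, 4, seed_score=20, x_drop=10))
# (40, 4)
```

A larger X lets the extension ride through longer runs of mismatches at the cost of more computation, which is the sensitivity and speed trade-off noted above.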

In one preferred approach, the "percentage of sequence identity" is
determined by comparing two optimally aligned sequences over a window of comparison
of at least 20 positions, wherein the portion of the polypeptide sequence in the
comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less,
usually 5 to 15 percent, or 10 to 12 percent, as compared to the reference sequences
(which do not comprise additions or deletions) for optimal alignment of the two
sequences. The percentage is calculated by determining the number of positions at which
the identical amino acid residue occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions in
the reference sequence (i.e., the window size) and multiplying the results by 100 to yield
the percentage of sequence identity.
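
The calculation described above can be sketched as follows. The sketch assumes the two sequences have already been optimally aligned to equal length, that gaps in the compared sequence are written as '-', and that only deletions relative to the reference are represented; the function name and the toy sequences are hypothetical.

```python
def percent_identity(reference: str, aligned: str, min_window: int = 20,
                     max_gap_fraction: float = 0.20) -> float:
    """Percent identity of `aligned` against an ungapped `reference` window."""
    if len(reference) != len(aligned):
        raise ValueError("aligned sequences must have equal length")
    if "-" in reference:
        raise ValueError("the reference window must not contain gaps")
    window = len(reference)
    if window < min_window:
        raise ValueError(f"window must span at least {min_window} positions")
    if aligned.count("-") / window > max_gap_fraction:
        raise ValueError("too many gaps in the compared sequence")
    # Matched positions divided by the window size, times 100.
    matched = sum(1 for r, a in zip(reference, aligned) if r == a)
    return 100.0 * matched / window


# Toy 20-residue window: 18 matched positions, one gap, one substitution.
print(percent_identity("MKTAYIAKQRQISFVKSHFS", "MKTAYIAKQRQISFVKSH-A"))  # 90.0
```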

Variants may also, or alternatively, be substantially homologous to a
native Ral2 polynucleotide (e.g., SEQ ID NO:3), or a portion or complement thereof.
Such polynucleotide variants are capable of hybridizing under stringent conditions to a
naturally occurring DNA sequence encoding a native Ral2 polynucleotide (or a
complementary sequence).

The phrase "selectively (or specifically) hybridizes to" refers to the
binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence
under stringent hybridization conditions when that sequence is present in a complex
mixture (e.g., total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acid, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic
Acid Probes, "Overview of principles of hybridization and the strategy of nucleic acid
assays" (1993). Generally, stringent conditions are selected to be about 5-10°C lower than
the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration)
at which 50% of the probes complementary to the target hybridize to the target sequence at
equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium
ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about
30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes
(e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the
addition of destabilizing agents such as formamide. For selective or specific
hybridization, a positive signal is at least two times background, preferably 10 times
background hybridization. Exemplary stringent hybridization conditions are as
follows: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or 5x SSC, 1%
SDS, incubating at 65°C, with a wash in 0.2x SSC and 0.1% SDS at 65°C.
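
The numeric ranges given above can be collected into a simple check. This is only a sketch of the stated ranges: the function and parameter names are hypothetical, and the sketch does not model probe sequence, Tm, or destabilizing agents such as formamide, all of which affect real stringency.

```python
def within_stated_stringent_ranges(sodium_molar: float, ph: float,
                                   temp_c: float, probe_length_nt: int) -> bool:
    """Check conditions against the salt, pH, and temperature ranges in the text."""
    salt_ok = sodium_molar < 1.0   # less than about 1.0 M sodium ion
    ph_ok = 7.0 <= ph <= 8.3       # pH 7.0 to 8.3
    # Short probes (up to 50 nt): at least about 30 C; longer probes: at least about 60 C.
    min_temp_c = 30.0 if probe_length_nt <= 50 else 60.0
    return salt_ok and ph_ok and temp_c >= min_temp_c


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """A positive signal is at least `fold` times background (2x; preferably 10x)."""
    return signal >= fold * background


# Hypothetical inputs for a long probe washed at 65 C in low salt.
print(within_stated_stringent_ranges(0.04, 7.5, 65.0, 300))  # True
# The same probe at 42 C fails this check because formamide is not modeled.
print(within_stated_stringent_ranges(0.04, 7.5, 42.0, 300))  # False
print(is_positive_signal(signal=250.0, background=100.0))    # True
```
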

It will be appreciated by those of ordinary skill in the art that, as a result of
the degeneracy of the genetic code, there are many nucleotide sequences that encode a
Ral2 polypeptide as described herein. Some of these polynucleotides bear minimal
homology to the nucleotide sequence of any native gene. Nonetheless, polynucleotides
that vary due to differences in codon usage are specifically contemplated by the present
invention. Further, alleles of the genes comprising the polynucleotide sequences
provided herein are within the scope of the present invention. Alleles are endogenous
genes that are altered as a result of one or more mutations, such as deletions, additions
and/or substitutions of nucleotides. The resulting mRNA and protein may, but need not,
have an altered structure or function. Alleles may be identified using standard techniques
(such as hybridization, amplification and/or database sequence comparison).
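
The scale of this degeneracy is easy to quantify: the number of distinct coding sequences for a peptide is the product of the codon counts of its residues. The sketch below uses the standard genetic code; the peptide shown is an arbitrary illustration, not a sequence from this disclosure, and the function name is hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNTS = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding `peptide` (stop codon excluded)."""
    return prod(CODON_COUNTS[residue] for residue in peptide.upper())


print(count_coding_sequences("MALWR"))  # 1 * 4 * 6 * 1 * 6 = 144
```

Even a five-residue peptide admits 144 coding sequences, so the count for a polypeptide of more than one hundred residues is astronomically large.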

Thus, terms such as "Ral2 polynucleotide" or "Ral2 polynucleotide
sequence" as used herein refer to native Ral2 polynucleotide sequences (e.g., SEQ ID
NO:3), fragments thereof, or any variants thereof. Functionally, any Ral2 polynucleotide
has the ability to produce a fusion protein, and its ability to produce fusion proteins in
host cells may be enhanced or unchanged, relative to the native Ral2 polynucleotide
(e.g., SEQ ID NO:3), or may be diminished by less than 50%, and preferably less than
20%, relative to the native Ral2 polynucleotide.

Nucleic acids encoding Ral2 polypeptides of this invention can be
prepared by any suitable method known in the art. Exemplary methods include cloning
and restriction of appropriate sequences or direct chemical synthesis by methods such as
the phosphotriester method of Narang et al. (1979) Meth. Enzymol. 68: 90-99; the
phosphodiester method of Brown et al. (1979) Meth. Enzymol. 68: 109-151; the
diethylphosphoramidite method of Beaucage et al. (1981) Tetra. Lett. 22: 1859-1862;
and the solid support method of U.S. Patent No. 4,458,066.

In one embodiment, a nucleic acid encoding MTB32A or Ral2 is isolated
by routine cloning methods. Nucleotide sequences of MTB32A or Ral2 as provided
herein are used to provide probes that specifically hybridize to other MTB32A or Ral2
nucleic acids in a genomic DNA sample, or to a MTB32A mRNA or Ral2 mRNA in a
total RNA sample (e.g., in a Southern or Northern blot). Once the target MTB32A or
Ral2 nucleic acids are identified, they can be isolated according to standard methods known
to those of skill in the art.

The desired nucleic acids can also be cloned using well known
amplification techniques. Examples of protocols sufficient to direct persons of skill
through in vitro amplification methods, including the polymerase chain reaction (PCR),
the ligase chain reaction (LCR), Qβ-replicase amplification and other RNA polymerase
mediated techniques are found in Berger, Sambrook, and Ausubel, as well as Mullis et al.
(1987) U.S. Patent No. 4,683,202; PCR Protocols: A Guide to Methods and Applications
(Innis et al. eds) Academic Press Inc. San Diego, CA (1990) (Innis); Arnheim &
Levinson (October 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3: 81-94;
Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc.
Natl. Acad. Sci. USA 87: 1874; Lomeli et al. (1989) J. Clin. Chem. 35: 1826; Landegren
et al. (1988) Science 241: 1077-1080; Van Brunt (1990) Biotechnology 8: 291-294; Wu
and Wallace (1989) Gene 4: 560; and Barringer et al. (1990) Gene 89: 117. Improved
methods of cloning in vitro amplified nucleic acids are described in Wallace et al., U.S.
Pat. No. 5,426,039. Suitable primers for use in the amplification of the nucleic acids of
the invention can be designed based on the sequences provided herein.

The MTB32A or Ral2 nucleic acids can also be cloned by detecting their
expressed product by means of assays based on the physical, chemical, or immunological
properties of the expressed protein. For example, one can identify a cloned MTB32A or
Ral2 nucleic acid by the ability of a polypeptide encoded by the nucleic acid to bind with
antisera or purified antibodies made against the MTB32A or Ral2 polypeptides provided
herein, which also recognize and selectively bind to the MTB32A or Ral2 homologs.

In some embodiments, it may be desirable to modify the MTB32A or
Ral2 nucleic acids of the invention. Altered nucleotide sequences which can be used in
accordance with the invention include deletions, additions or substitutions of different
nucleotide residues resulting in a sequence that encodes the same or a functionally
equivalent gene product. The gene product itself may contain deletions, additions or
substitutions of amino acid residues, which result in a silent change thus producing a
functionally equivalent antigenic epitope. Such conservative amino acid substitutions
may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues involved. Preferably, Ral2
nucleic acids that are shorter in length than SEQ ID NO:3 and that encode a biologically
active fusion partner can be used. Such smaller functional equivalents of Ral2 polypeptides
may be desirable to increase the amount of host cell resources that are available for the
production of heterologous polypeptides of interest.

One of skill will recognize many ways of generating alterations in a given
nucleic acid construct. Such well-known methods include site-directed mutagenesis, PCR
amplification using degenerate oligonucleotides, exposure of cells containing the nucleic
acid to mutagenic agents or radiation, chemical synthesis of a desired oligonucleotide
(e.g., in conjunction with ligation and/or cloning to generate large nucleic acids) and other
well-known techniques. See, e.g., Gillam and Smith (1979) Gene 8:81-97; Roberts et al.
(1987) Nature 328: 731-734.

Recombinant nucleic acids that encode a fusion polypeptide comprising a
Ral2 polypeptide and a selected heterologous polypeptide can be prepared using any
methods known in the art. As described above, recombinant nucleic acids are constructed
so that a Ral2 polynucleotide sequence is located in any suitable place in a construct.
Preferably, a Ral2 polynucleotide sequence is located 5' to a selected heterologous
polynucleotide sequence. Ral2 and heterologous polynucleotide sequences can also be
modified to facilitate their fusion and subsequent expression of fusion polypeptides. For
example, the 3' stop codon of the Ral2 polynucleotide sequence can be substituted with
an in-frame linker sequence, which may provide restriction sites and/or cleavage sites.
The recombinant nucleic acids can further comprise other nucleotide sequences such as
sequences that encode affinity tags to facilitate protein purification.

Expression Vectors and Host Cells

The recombinant nucleic acids as described herein can be joined to a
variety of other nucleotide sequences using established recombinant DNA techniques.
For example, a polynucleotide can be cloned into any of a variety of cloning vectors,
including plasmids, phagemids, lambda phage derivatives and cosmids. Vectors of
particular interest include expression vectors, replication vectors, probe generation
vectors and sequencing vectors. In general, a vector will contain an origin of replication
functional in at least one organism, convenient restriction endonuclease sites and one or
more selectable markers. Other elements will depend on the desired use, and will be
apparent to those of ordinary skill in the art.

DNA sequences encoding the polypeptide components may be assembled
separately, and ligated into an appropriate expression vector. The 3' end of the DNA
sequence encoding one polypeptide component is ligated, with or without a
polynucleotide sequence encoding a peptide linker, to the 5' end of a DNA sequence
encoding the second polypeptide component so that the reading frames of the sequences
are in phase. This permits translation into a single fusion protein that retains the
biological activity of both component polypeptides.
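
The in-phase joining described above can be sketched as string operations on coding sequences. All sequences below are short hypothetical placeholders, not SEQ ID NO:3 or any other sequence from this disclosure, and the function name is illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(first_cds: str, second_cds: str, linker: str = "") -> str:
    """Join two coding sequences, with an optional linker, in a single reading frame.

    A terminal stop codon on the first sequence is removed so that translation
    reads through into the linker and the second sequence.
    """
    first_cds, second_cds, linker = first_cds.upper(), second_cds.upper(), linker.upper()
    # Every part must be a whole number of codons to keep the frames in phase.
    for name, seq in (("first", first_cds), ("linker", linker), ("second", second_cds)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of three")
    if first_cds[-3:] in STOP_CODONS:
        first_cds = first_cds[:-3]
    fused = first_cds + linker + second_cds
    # No stop codon may interrupt the fused frame before the final codon.
    internal = [fused[i:i + 3] for i in range(0, len(fused) - 3, 3)]
    if any(codon in STOP_CODONS for codon in internal):
        raise ValueError("internal stop codon in fused reading frame")
    return fused


# Hypothetical partner sequences; GGATCC is a BamHI site that also encodes Gly-Ser.
fused = fuse_in_frame("ATGACCGGCTAA", "ATGAAACTGTGA", linker="GGATCC")
print(len(fused) // 3)  # 9 codons, ending in the single stop codon TGA
```

Keeping every component a multiple of three nucleotides is what keeps the two reading frames in phase; a linker of any other length would shift the frame of the second component.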

The ligated DNA sequences are operably linked to suitable transcriptional
or translational regulatory elements. The regulatory elements responsible for expression
of DNA are located only 5' to the DNA sequence encoding the first polypeptide.
Similarly, stop codons required to end translation and transcription termination signals are
only present 3' to the DNA sequence encoding the second polypeptide.

Depending on the host/vector system utilized, any of a number of suitable
transcription and translation elements, including constitutive and inducible promoters,
may be used in the expression vector. For example, when cloning in bacterial systems,
inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid
promoter; cytomegalovirus promoter) and the like may be used; when cloning in yeast
cell systems, promoters such as ADHI, PGK, PHO5, or the α factor promoter may be
used; when cloning in insect cell systems, promoters such as the baculovirus polyhedrin
promoter may be used; when cloning in plant cell systems, promoters derived from the
genome of plant cells (e.g., heat shock promoters; the promoter for the small subunit of
RUBISCO; the promoter for the chlorophyll α/β binding protein) or from plant viruses
(e.g., the 35S RNA promoter of CaMV; the coat protein promoter of TMV) may be used;
when cloning in mammalian cell systems, promoters derived from the genome of
mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the
adenovirus late promoter; the vaccinia virus 7.5K promoter) may be used; when
generating cell lines that contain multiple copies of the antigen coding sequence, SV40-,
BPV- and EBV-based vectors may be used with an appropriate selectable marker.

A variety of host-expression vector systems may be utilized to express
Ral2 fusion protein coding sequences. These include, but are not limited to,
microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant
bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing a
coding sequence; yeast (e.g., Saccharomyces, Pichia) transformed with recombinant
yeast expression vectors containing a coding sequence; insect cell systems infected with
recombinant virus expression vectors (e.g., baculovirus) containing a coding sequence;
plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower
mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant
plasmid expression vectors (e.g., Ti plasmid) containing a coding sequence; or
mammalian cell systems (e.g., COS, CHO, BHK, 293, 3T3 cells). The expression
elements of these systems vary in their strength and specificities.

Bacterial systems are preferred for the expression of Ral2 fusion 
polypeptides. Commonly used prokaryotic control sequences, which are defined herein
to include promoters for transcription initiation, optionally with an operator, along with
ribosome binding site sequences, include such commonly used promoters as the
beta-lactamase (penicillinase) and lactose (lac) promoter systems (Chang et al., Nature
(1977) 198:1056), the tryptophan (trp) promoter system (Goeddel et al., Nucleic Acids
Res. (1980) 8:4057), the tac promoter (DeBoer et al., Proc. Natl. Acad. Sci. U.S.A.
(1983) 80:21-25); and the lambda-derived PL promoter and N-gene ribosome binding site
(Shimatake et al., Nature (1981) 292:128). The particular promoter system is not critical
to the invention; any available promoter that functions in prokaryotes can be used.

Either constitutive or regulated promoters can be used in the present 
invention. Regulated promoters can be advantageous because the host cells can be grown 
to high densities before expression of the Ral2 fusion polypeptides is induced.
High-level expression of heterologous proteins slows cell growth in some situations. Regulated
promoters especially suitable for use in E. coli include the bacteriophage lambda PL
promoter, the hybrid trp-lac promoter (Amann et al., Gene (1983) 25:167; de Boer et al.,
Proc. Natl. Acad. Sci. USA (1983) 80:21), and the bacteriophage T7 promoter (Studier et
al., J. Mol. Biol. (1986); Tabor et al. (1985)). These promoters and their use are
discussed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed.,
Vols. 1-3, Cold Spring Harbor Laboratory.

For expression of Ral2 fusion polypeptides in prokaryotic cells other than 
E. coli, a promoter that functions in the particular prokaryotic species is required. Such
promoters can be obtained from genes that have been cloned from the species, or 
heterologous promoters can be used. For example, the hybrid trp-lac promoter functions 
in Bacillus in addition to E. coli. 

A ribosome binding site (RBS) is conveniently included in the expression 
cassettes of the invention. An RBS in E. coli, for example, consists of a nucleotide
sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation
codon (Shine and Dalgarno, Nature (1975) 254:34; Steitz, In Biological regulation and
development: Gene expression (ed. R.F. Goldberger), vol. 1, p. 349, 1979, Plenum
Publishing, NY).
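
As a rough illustration of the spacing rule just stated, the following Python sketch scans the region upstream of an initiation codon for a Shine-Dalgarno-like motif ending 3-11 nucleotides before the ATG. The consensus `AGGAGG`, the minimum match length of 4, and the function names are assumptions made for this example; the text itself specifies only the 3-9 nucleotide length and the 3-11 nucleotide spacing.

```python
SD_CONSENSUS = "AGGAGG"  # assumed consensus, used here for illustration only

def sd_candidates(min_len: int = 4) -> list:
    """All substrings of the consensus with at least min_len bases, longest first."""
    subs = {
        SD_CONSENSUS[i:j]
        for i in range(len(SD_CONSENSUS))
        for j in range(i + min_len, len(SD_CONSENSUS) + 1)
    }
    return sorted(subs, key=lambda s: (-len(s), s))

def find_rbs(seq: str, atg_index: int, min_gap: int = 3, max_gap: int = 11):
    """Return (motif, gap) for the best motif ending min_gap..max_gap nt before the ATG."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    for motif in sd_candidates():
        for gap in range(min_gap, max_gap + 1):
            end = atg_index - gap          # first base after the motif
            start = end - len(motif)
            if start >= 0 and seq[start:end] == motif:
                return motif, gap
    return None

upstream = "TTTA" + "AGGAGG" + "TATACAT"   # 7-nt spacer before the start codon
cassette = upstream + "ATGCAT"
print(find_rbs(cassette, len(upstream)))
```

Longer motifs are tried first, so a full six-base match is preferred over a shorter partial match at a different spacing.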

When large quantities of the Ral2 fusion protein are to be produced,
vectors which direct the expression of high levels of fusion protein products that are
readily purified may be desirable. Such vectors include, but are not limited to, the E. coli
expression vector pUR278 (Ruther et al. (1983) EMBO J. 2:1791), in which a coding
sequence may be ligated into the vector in frame with the lacZ coding region so that a
hybrid protein is produced; pIN vectors (Inouye and Inouye (1985) Nucleic Acids Res.
13:3101-3109; Van Heeke and Schuster (1989) J. Biol. Chem. 264:5503-5509); and the
like. pGEX vectors may also be used to express foreign polypeptides as fusion proteins
with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can
be purified easily from lysed cells by adsorption to glutathione-agarose beads followed by
elution in the presence of free glutathione. For certain applications, it may be desirable to
cleave the heterologous polypeptide of interest from the Ral2 fusion polypeptide after
purification. This can be accomplished by any of several methods known in the art. For
example, the pGEX vectors are designed to include thrombin or factor Xa protease
cleavage sites so that the cloned fusion polypeptide of interest can be released from the
GST moiety. See, e.g., Sambrook et al., supra; Itakura et al., Science (1977) 198:1056;
Goeddel et al., Proc. Natl. Acad. Sci. USA (1979) 76:106; Nagai et al., Nature (1984)
309:810; Sung et al., Proc. Natl. Acad. Sci. USA (1986) 83:561. Cleavage sites can be
engineered into the recombinant nucleic acids for the fusion proteins at the desired point
of cleavage.

Fusion Polypeptides

Within the context of the present invention, a "fusion" polypeptide 
comprises at least two parts: a Ral2 polypeptide as described herein, and a heterologous 
polypeptide of interest. In a fusion polypeptide, a Ral2 polypeptide is preferably fused,
directly or indirectly, to the amino terminus of a heterologous polypeptide of interest, 
although fusion to the carboxy terminus of the heterologous polypeptide or insertion of 
the heterologous polypeptide into a site within an Ral2 polypeptide may also be 
appropriate. 

Any heterologous polypeptide of interest, of either eukaryotic or prokaryotic
origin, can be selected as a fusion partner to a Ral2 polypeptide. These heterologous
polypeptides include, but are not limited to, pathogenic antigens, bacterial antigens, viral
antigens, cancer antigens, tumor antigens, and tumor suppressors. Exemplary
heterologous polypeptides include DPPD, WT1, mammaglobin, H9-32A polypeptides, or
other M. tuberculosis proteins. Any one of these polypeptides can be used alone or in
combination as a heterologous polypeptide that can be selected as a fusion partner.

As noted above, a fusion polypeptide may comprise a native Ral2 
polypeptide (e.g., SEQ ID NO:4), a variant thereof, or a fragment thereof. A polypeptide 
"variant," as used herein, is a polypeptide that differs from a native Ral2 polypeptide in
one or more substitutions, deletions, additions and/or insertions, such that the biological
activity of the polypeptide is not substantially diminished. In other words, the ability of a
variant to produce fusion polypeptide in host cells may be enhanced or unchanged,
relative to the native Ral2 protein, or may be diminished by less than 50%, and
preferably less than 20%, relative to the native Ral2 protein. Such variants may
generally be identified by modifying one of the above polypeptide sequences and
evaluating the level of fusion polypeptide production in host cells, such as in E. coli.
Exemplary variants include those in which a small portion (e.g., 1-30 amino acids,
preferably 5-15 amino acids) has been removed from the N- and/or C-terminus of the
native Ral2 polypeptide. In one embodiment, variants of native Ral2 polypeptides
comprise at least about 5 amino acids, at least about 10 amino acids, at least about 30
amino acids, at least about 50 amino acids, or at least about 100 amino acids.
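
The 50% and 20% limits in the definition of a variant can be expressed as a simple comparison of fusion polypeptide yields. The sketch below is illustrative only; the function name, the category labels, and the example yields are hypothetical and assume that native and variant constructs are measured under the same conditions (e.g., mg of fusion protein per liter of E. coli culture).

```python
def classify_variant(native_yield: float, variant_yield: float) -> str:
    """Compare a variant's fusion-protein yield with that of the native partner."""
    if native_yield <= 0:
        raise ValueError("native yield must be positive")
    # Percent loss relative to the native partner; negative means improved yield.
    loss_pct = (native_yield - variant_yield) / native_yield * 100.0
    if loss_pct <= 0:
        return "enhanced_or_unchanged"
    if loss_pct < 20:
        return "preferred"      # diminished by less than 20%
    if loss_pct < 50:
        return "acceptable"     # diminished by less than 50%
    return "outside_definition"

# Hypothetical yields in mg per liter of induced culture.
for variant in (66, 54, 36, 30):
    print(variant, classify_variant(60, variant))
```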

In one embodiment, the Ral2 polypeptide sequence is as shown in SEQ ID 
NO:4. In other embodiments, the Ral2 polypeptide sequence comprises a portion of
SEQ ID NO:4. For instance, an Ral2 polypeptide comprising 30 amino acids (e.g.,
amino acids 1-30 of SEQ ID NO:4) or an Ral2 polypeptide comprising 128 amino acids
(e.g., amino acids 1-128 of SEQ ID NO:4) can be used as a fusion partner. See Examples 
2 and 3 below. 

Polypeptide variants preferably exhibit at least about 70%, more preferably 
at least about 80% or at least about 90%, and most preferably at least about 95% identity
(determined as described above) to the identified polypeptides. Optionally, identity exists 
over a region that is at least about 20 to about 50 amino acids in length, or optionally over 
a region that is 75-100 amino acids in length. 
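
For orientation, percent identity over an aligned region can be computed as sketched below. This is a deliberately simplified stand-in and not the comparison method referred to above: it assumes the two sequences have already been aligned to equal length (with `-` marking gaps), and the function names and thresholds passed to `meets_identity` are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue                      # column with gaps in both sequences
        compared += 1
        if x == y:
            matches += 1                  # a gap never equals a residue here
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

def meets_identity(aligned_a: str, aligned_b: str,
                   min_identity: float = 70.0, min_region: int = 20) -> bool:
    """True if the aligned region is long enough and identical enough."""
    region = sum(1 for x, y in zip(aligned_a, aligned_b) if not (x == "-" and y == "-"))
    return region >= min_region and percent_identity(aligned_a, aligned_b) >= min_identity

# Toy 10-residue alignment with a single mismatch.
print(percent_identity("MHHHTAASDN", "MHHHTAGSDN"))
```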

Preferably, a variant contains conservative substitutions. A "conservative 
substitution" is one in which an amino acid is substituted for another amino acid that has
similar properties, such that one skilled in the art of peptide chemistry would expect the
secondary structure and hydropathic nature of the polypeptide to be substantially
unchanged. Amino acid substitutions may generally be made on the basis of similarity in
polarity, charge, solubility, hydrophobicity, hydrophilicity and/or the amphipathic nature
of the residues. For example, negatively charged amino acids include aspartic acid and
glutamic acid; positively charged amino acids include lysine and arginine; and amino
acids with uncharged polar head groups having similar hydrophilicity values include
leucine, isoleucine and valine; glycine and alanine; asparagine and glutamine; and serine,
threonine, phenylalanine and tyrosine. Other groups of amino acids that may represent
conservative changes include: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr,
thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his. A variant may
also, or alternatively, contain nonconservative changes. In a preferred embodiment,
variant polypeptides differ from a native sequence by substitution, deletion or addition of
five amino acids or fewer. Variants may also (or alternatively) be modified by, for
example, the deletion or addition of amino acids that have minimal influence on the
immunogenicity, secondary structure and hydropathic nature of the polypeptide.
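
The five numbered groups listed above can be encoded directly to flag which substitutions in a candidate variant fall within a group. The sketch below is one possible encoding using one-letter amino acid codes; the function names and the toy sequences are hypothetical, and the check covers substitutions only (equal-length sequences), not deletions or additions.

```python
# Groups (1)-(5) from the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("APGEDQNST"),  # (1) ala, pro, gly, glu, asp, gln, asn, ser, thr
    set("CSYT"),       # (2) cys, ser, tyr, thr
    set("VILMAF"),     # (3) val, ile, leu, met, ala, phe
    set("KRH"),        # (4) lys, arg, his
    set("FYWH"),       # (5) phe, tyr, trp, his
]

def is_conservative(a: str, b: str) -> bool:
    """True if two different residues share at least one of the listed groups."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)

def classify_substitutions(native: str, variant: str) -> list:
    """List (position, native residue, variant residue, conservative?) for each change."""
    if len(native) != len(variant):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    return [
        (pos, n, v, is_conservative(n, v))
        for pos, (n, v) in enumerate(zip(native.upper(), variant.upper()), start=1)
        if n != v
    ]

print(classify_substitutions("MKAD", "MRAW"))
```

Counting the entries returned also gives a quick check against the preferred limit of five or fewer changes.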

Thus, terms such as "Ral2 polypeptide" or "Ral2 polypeptide
sequence" as used herein refer to native Ral2 polypeptide sequences (e.g., SEQ ID
NO:4), fragments thereof (e.g., SEQ ID NO:17 or 18), or any variants thereof.
Functionally, a Ral2 polypeptide has the ability to produce a fusion protein, and its
ability to produce fusion proteins in host cells may be enhanced or unchanged, relative
to the native Ral2 polypeptide (e.g., SEQ ID NO:4), or may be diminished by less than
50%, and preferably less than 20%, relative to the native Ral2 polypeptide.

As noted above, fusion polypeptides may be conjugated to a linker or other
sequence for ease of synthesis, purification or identification of the polypeptide or to
enhance binding of the polypeptide to a solid support. For example, a peptide linker
sequence may be employed to separate a Ral2 polypeptide and a heterologous
polypeptide of interest by a distance sufficient to ensure that each polypeptide folds into
its secondary and tertiary structures. Such a peptide linker sequence is incorporated into
the fusion protein using standard techniques well known in the art. Suitable peptide
linker sequences may be chosen based on the following factors: (1) their ability to adopt
a flexible extended conformation; (2) their inability to adopt a secondary structure that
could interact with functional epitopes on the first and second polypeptides; and (3) the
lack of hydrophobic or charged residues that might react with the polypeptide functional
epitopes. In certain embodiments, peptide linker sequences may contain Gly, Asn and
Ser residues. Other near neutral amino acids, such as Thr and Ala, may also be used in the
linker sequence. Amino acid sequences which may be usefully employed as linkers
include those disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc. Natl.
Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent No. 4,935,233 and U.S. Patent
No. 4,751,180. The linker sequence may generally be from 1 to about 50 amino acids in
length. Linker sequences are not required when the first and second polypeptides have
non-essential N-terminal amino acid regions that can be used to separate the functional
domains and prevent steric interference.

In a preferred embodiment, a linker can provide a specific cleavage site
between a Ral2 polypeptide and a heterologous polypeptide of interest. Such a cleavage
site may contain a target for a proteolytic enzyme such as, for example, enterokinase,
Factor Xa, trypsin, collagenase, thrombin, or ubiquitin hydrolase; or for chemical cleavage
agents such as, for example, cyanogen bromide or hydroxylamine.
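
To make the idea of an engineered cleavage site concrete, the Python sketch below locates predicted cut positions for a few of the agents named above. The recognition motifs are commonly cited simplifications supplied for this example (they are not taken from the text), real enzyme specificity is broader, and trypsin, collagenase and ubiquitin hydrolase are not modeled. The function name and the toy protein are hypothetical.

```python
import re

# Simplified recognition motifs (assumptions for illustration); the cut falls
# immediately after the matched residues.
CLEAVAGE_PATTERNS = {
    "enterokinase": r"DDDDK",        # cuts after Lys of Asp-Asp-Asp-Asp-Lys
    "factor_xa": r"I[ED]GR",         # cuts after Arg of Ile-Glu/Asp-Gly-Arg
    "thrombin": r"LVPR(?=GS)",       # cuts between Arg and Gly of LVPR-GS
    "cyanogen_bromide": r"M",        # cuts C-terminal to Met
    "hydroxylamine": r"N(?=G)",      # cuts the Asn-Gly bond
}

def cleavage_positions(protein: str, agent: str) -> list:
    """Return cut positions as the number of residues N-terminal to each cut."""
    pattern = CLEAVAGE_PATTERNS[agent]
    return [m.end() for m in re.finditer(pattern, protein.upper())]

# Toy fusion: His-tagged partner stub, enterokinase site, then a target stub.
fusion = "MHHHHHHTAAS" + "DDDDK" + "DPPD"
cut = cleavage_positions(fusion, "enterokinase")[0]
print(fusion[:cut], fusion[cut:])
```

Scanning the heterologous polypeptide itself with the same function is a quick way to see whether the chosen agent would also cut inside the protein of interest.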

A fusion polypeptide may optionally contain an affinity tag which is
linked to the fusion polypeptide so that the purification of recombinant polypeptides can
be simplified. For example, multiple histidine residues encoded by the tag allow the use
of metal chelate affinity chromatography methods for the purification of fusion
polypeptides. Other examples of affinity tag molecules include Strep-tag, PinPoint,
maltose binding protein, glutathione S-transferase, etc. See, e.g., Glick and Pasternak
(1999) Molecular Biotechnology Principles and Applications of Recombinant DNA, 2nd
Ed., American Society for Microbiology, Washington, DC.

Fusion polypeptides may be prepared using any of a variety of well known
techniques. Recombinant fusion polypeptides encoded by DNA sequences as described
above may be readily prepared from the DNA sequences using any of a variety of
expression vectors known to those of ordinary skill in the art. Expression may be
achieved in any appropriate host cell that has been transformed or transfected with an
expression vector containing a DNA molecule that encodes a recombinant polypeptide.
Suitable host cells include prokaryotes, yeast and higher eukaryotic cells described above.
Preferably, the host cell employed is E. coli. Supernatants from suitable host/vector
systems which secrete recombinant protein or polypeptide into culture media may be first
concentrated using a commercially available filter. Following concentration, the
concentrate may be applied to a suitable purification matrix such as an affinity matrix or
an ion exchange resin. Finally, one or more reverse phase HPLC steps can be employed
to further purify a recombinant polypeptide.

Portions and other variants having fewer than about 100 amino acids, and
generally fewer than about 50 amino acids, may also be generated by synthetic means,
using techniques well known to those of ordinary skill in the art. For example, such
polypeptides may be synthesized using any of the commercially available solid-phase
techniques, such as the Merrifield solid-phase synthesis method, where amino acids are
sequentially added to a growing amino acid chain. See Merrifield, J. Am. Chem. Soc.
85:2149-2154, 1963. Equipment for automated synthesis of polypeptides is commercially
available from suppliers such as Perkin Elmer/Applied BioSystems Division (Foster City,
CA), and may be operated according to the manufacturer's instructions.

In general, polypeptides (including fusion proteins) and polynucleotides as
described herein are isolated. An "isolated" polypeptide or polynucleotide is one that is
removed from its original environment. For example, a naturally-occurring protein is
isolated if it is separated from some or all of the coexisting materials in the natural
system. Preferably, such polypeptides are at least about 90% pure, more preferably at
least about 95% pure and most preferably at least about 99% pure. A polynucleotide is
considered to be isolated if, for example, it is cloned into a vector that is not a part of the
natural environment.

In addition to providing stable and high yield expression of fusion 
polypeptides of interest, the recombinant fusion nucleic acids and fusion polypeptides of 
the invention can be used in a number of other methods. For example, the fusion 
polypeptide coding sequence of the invention can be used to encode a protein product for
use as an antigen for detecting serum antibodies. For example, the presence of serum
antibodies to M. tuberculosis antigens in an individual indicates that the individual is
infected with M. tuberculosis. In standard diagnostic tests, serum antibodies to M.
tuberculosis are detected by monitoring binding of serum antibodies to M. tuberculosis
proteins. The fusion polypeptides of the invention are useful as sources of proteins for
monitoring binding of serum antibodies to fusion proteins.

Alternatively, the fusion polypeptide can be used as an immunogen to 
induce and/or enhance immune responses. Such coding sequences can be ligated with a 
coding sequence of another molecule such as a M. tuberculosis antigen, a cytokine or an 
adjuvant. Such polynucleotides may be used in vivo as a DNA vaccine (U.S. Patent Nos.
5,589,466; 5,679,647; and 5,703,055). Alternatively, purified or partially purified fusion
polypeptides or fragments may be used as vaccines or therapeutic compositions. Any of a
variety of methods known in the art can be employed to produce vaccines or therapeutic
compositions comprising the fusion polypeptides of the present invention.

Protein Purification and Preparations 

Once a recombinant protein is expressed, it can be identified by assays
based on the physical or functional properties of the product, including radioactive
labeling of the product followed by analysis by gel electrophoresis, radioimmunoassay,
ELISA, bioassays, etc.

Once the encoded protein is identified, it may be isolated and purified by 
standard methods including chromatography (e.g., high performance liquid
chromatography, ion exchange, affinity, and sizing column chromatography),
centrifugation, differential solubility, or by any other standard technique for the
purification of proteins. See, generally, R. Scopes, Protein Purification, Springer-Verlag,
N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification,
Academic Press, Inc. N.Y. (1990). The actual conditions used will depend, in part, on
factors such as net charge, hydrophobicity, hydrophilicity, etc., and will be apparent to
those having skill in the art. The functional properties may be evaluated using any
suitable assays.

The functional properties of the fusion protein may be evaluated using any
suitable assay such as antibody binding, induction of T cell proliferation, or stimulation of
cytokine production such as IL-2, IL-4 and IFN-γ. For the practice of the present
invention, it is preferred that each fusion protein is at least 80% purified from other
proteins. It is more preferred that they are at least 90% purified. For in vivo
administration, it is preferred that the proteins are greater than 95% purified.

The purified proteins may be further processed before use. For example, 
the proteins may be digested with a specific enzyme to separate the Ral2 polypeptide from
the heterologous polypeptide.

One of skill would recognize that modifications can be made to the 
recombinant nucleic acids and fusion polypeptides without diminishing their biological 
activity. Some modifications may be made to facilitate the cloning, expression, or
incorporation of the tag molecule into a fusion polypeptide. Such modifications are well
known to those of skill in the art and include, for example, a methionine added at the
amino terminus to provide an initiation site, or additional amino acids (e.g., poly His)
placed on either terminus to create conveniently located restriction sites or termination
codons or purification sequences.

The following Examples are offered by way of illustration and not by way
of limitation.

EXAMPLES 

The following examples describe experiments that illustrate that Ral2 
fusion constructs produced stable and high yield expression of fusion polypeptides. The 
following examples also illustrate that various Ral2 sequences can be used as a fusion
partner. 

EXAMPLE 1: The Full Length Ral2 Sequence (SEQ ID NO:4) as a Fusion Partner
A. Construction of Expression Vectors

Coding sequences of M. tuberculosis antigens were modified by PCR in
order to facilitate their fusion and subsequent expression of fusion protein. pET17b
vector (Novagen) was modified to include Ral2, a 14 kDa C-terminal fragment of the
serine protease antigen MTB32A of M. tuberculosis. The 3' stop codon of the Ral2
sequence was substituted with an in frame EcoRI site and the N-terminal end was
engineered to code for six His-tag residues immediately following the initiator Met to
facilitate a simple one step purification protocol of Ral2 recombinant proteins by affinity
chromatography over Ni-NTA matrix.

Specifically, the C-terminal fragment of antigen MTB32A was amplified
by standard PCR methods using the 5' oligonucleotide primer 5'-CAA TTA CAT ATG
CAT CAC CAT CAC CAT CAC ACG GCC GCG TCC GAT AAC TTC-3' (SEQ ID
NO:13) and the 3' oligonucleotide primer 5'-CTA ATC GAA TTC GGC CGG GGG
TCC CTC GGC CAA-3' (SEQ ID NO:14). The 450 bp product was digested with NdeI and
EcoRI and cloned into the pET17b expression vector similarly digested with the same
enzymes. Expression of the recombinant Ral2 protein was accomplished following
transformation into the E. coli BL-21 (pLysE) host cells (Novagen) and induction with
IPTG. Following lysis of the E. coli cells and centrifugation at 10K rpm, recombinant
Ral2 was found in the soluble supernatant fraction. Protein from the soluble supernatant
was purified by affinity chromatography over an Ni-NTA column and remained
soluble following dialysis in 1x PBS. The amount of purified protein obtained was
routinely in the 60 to 100 mg per liter range.
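
The design of the two primers above can be checked with a short Python sketch that locates the restriction sites and translates the 5' primer from its initiator ATG. The helper names are hypothetical, and the recognition sequences used for NdeI (CATATG) and EcoRI (GAATTC) are the standard ones, supplied here as background rather than taken from the text.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC"}

def clean(seq: str) -> str:
    """Remove spacing and normalize case."""
    return "".join(seq.split()).upper()

def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def sites_in(seq: str) -> list:
    return [name for name, site in SITES.items() if site in seq]

FWD = clean("CAA TTA CAT ATG CAT CAC CAT CAC CAT CAC ACG GCC GCG TCC GAT AAC TTC")
REV = clean("CTA ATC GAA TTC GGC CGG GGG TCC CTC GGC CAA")

print(sites_in(FWD), translate(FWD[FWD.find("ATG"):]))   # NdeI, Met + six His + start of Ral2
print(sites_in(REV), translate(reverse_complement(REV)))  # EcoRI read on the coding strand
```

The first line of output shows the NdeI site overlapping the initiator Met followed by six His codons; the second shows the EcoRI site read in frame on the coding strand at the 3' end of the amplified fragment.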

The DPPD sequence was engineered for expression as a fusion protein with
Ral2 by designing oligonucleotide primers to specifically amplify the mature secreted
form. The 5' oligonucleotide containing an enterokinase recognition site (DDDDK) has the
sequence 5'-CAA TTA GAA TTC GAC GAC GAC GAC AAG GAT CCA CCT GAC
CCG CAT CAG-3' (SEQ ID NO:15) and the 3' oligonucleotide sequence is 5'-CAA TTA
GAA TTC TCA GGG AGC GTT GGG CTG CTC-3' (SEQ ID NO:16). The resulting PCR
amplified product was digested with EcoRI and subcloned into the EcoRI site of the
pET-Ral2 vector. Following transformation into the E. coli host strain (XL1-Blue;
Stratagene), clones containing the correct size insert were submitted for sequencing in
order to identify those that were in frame with the Ral2 fusion. Subsequently, the DNA
of interest (Fig. 3) was transformed into the BL-21 (pLysE) bacterial host and fusion
protein expressed following induction of the culture with IPTG.

B. Expression and Purification of Fusion Proteins

The recombinant (His-tag) Ral2-DPPD fusion protein was purified from
500 ml of IPTG induced batch cultures from the soluble supernatant by affinity
chromatography using the one step QIAexpress Ni-NTA Agarose matrix (QIAGEN,
Chatsworth, CA) in the presence of 8 M urea. Briefly, 20 ml of an overnight saturated
culture of BL21 containing the pET construct was added into 500 ml of 2xYT media
containing 50 µg/ml ampicillin and 34 µg/ml chloramphenicol, and grown at 37°C with
shaking. The bacterial cultures were induced with 2 mM IPTG at an OD560 of 0.3 and
grown for an additional 3 h (OD 1.3 to 1.9). Cells were harvested from 500 ml batch
cultures by centrifugation and resuspended in 20 ml of binding buffer (0.1 M sodium
phosphate, pH 8.0; 10 mM Tris-HCl, pH 8.0) containing 2 mM PMSF and 20 µg/ml
leupeptin. E. coli was lysed by adding 15 mg of lysozyme and rocking for 30 min at 4°C
following sonication (4 x 30 sec). Lysed cells were spun at 12K rpm for 30 min and
urea was added directly to the supernatant at a final concentration of 8 M.

The supernatant was batch bound to Ni-NTA agarose resin (5 ml resin per
500 ml induction) by rocking at room temperature for 1 h and the matrix passed over a column. The
flow through was passed twice over the same column followed by three washes with 30
ml each of wash buffer (0.1 M sodium phosphate and 10 mM Tris-HCl, pH 6.3) also
containing 8 M urea. Bound protein was eluted with 30 ml of 100 mM imidazole in wash
buffer and 5 ml fractions collected. Fractions containing the recombinant antigen were
pooled, dialyzed against 10 mM Tris-HCl (pH 8.0), bound one more time to the Ni-NTA
matrix, eluted and dialyzed in 1x PBS (pH 7.4) or 10 mM Tris-HCl (pH 7.8). The yield
of the purified recombinant fusion protein was in the range of 50 to 75 mg per liter of induced
bacterial culture with greater than 95% purity representing a single band. Recombinant
proteins were assayed for endotoxin contamination using the Limulus assay
(BioWhittaker) and were shown to contain < 10 E.U./mg (< 1 ng LPS/mg).
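
The figures in this paragraph can be related by simple arithmetic, sketched below in Python. The function names are hypothetical, and the conversion factor of 10 E.U. per ng LPS is inferred from the stated equivalence (< 10 E.U./mg corresponding to < 1 ng LPS/mg) rather than given explicitly.

```python
def batch_yield_mg(yield_mg_per_liter: float, culture_ml: float) -> float:
    """Expected purified protein from one batch culture of the given volume."""
    return yield_mg_per_liter * culture_ml / 1000.0

def endotoxin_ng_per_mg(eu_per_mg: float, eu_per_ng: float = 10.0) -> float:
    """Convert endotoxin units per mg protein to ng LPS per mg protein."""
    return eu_per_mg / eu_per_ng

# A 500 ml induction at the reported 50-75 mg per liter.
print(batch_yield_mg(50, 500), batch_yield_mg(75, 500))   # 25.0 37.5
# The reported upper bound of 10 E.U./mg expressed as ng LPS per mg.
print(endotoxin_ng_per_mg(10))                             # 1.0
```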

C. Generation of Antiserum 

The purified fusion protein (100 µg) was mixed with 100 µg of muramyl
dipeptide, brought up to 1 ml with 1x PBS and emulsified with 1 ml IFA (incomplete
Freund's; Life Technologies) adjuvant. The emulsion was injected at multiple sites s.c.
into a female New Zealand rabbit (R&R Rabbitry, Stanwood, WA). The rabbit was given
two subsequent boosters (100 µg antigen in IFA) 6 weeks apart and a final i.v. shot with
100 µg of the recombinant protein again given after 6 weeks. One week after the final
boost, the rabbit was sacrificed and serum was collected and stored at -20°C.


D. Immunoblotting Analysis 

M. tuberculosis (strain H37Rv) total lysate or PPD (2.5 µg each) and 25 ng
of the purified recombinant Ral2-DPPD fusion protein were separated by electrophoresis
on 16% SDS-PAGE gels and transferred to nitrocellulose using a semi-dry transfer
apparatus (BioRad). Blots, in duplicate, were blocked for a minimum of 1 hr with
PBS/0.1% Tween and probed with polyclonal sera from the same rabbit prior to
immunization or post immunization with the purified recombinant fusion protein (diluted
1:500 in PBS/0.1% Tween 20). Reactivity was assessed as previously described using
[125I]-protein A, followed by autoradiography.

E. Results

Several expression systems were initially evaluated for the expression of
DPPD in E. coli. These included sub-cloning of the DPPD coding sequence 1) as non-fusion
constructs in pET17b (Novagen) and pQ30 (Qiagen, Santa Clarita, CA) or 2) as fusion
constructs using pET32A (Novagen, Madison, WI) or pGEX-2T (Pharmacia Biotech,
Piscataway, NJ). In all of these systems, very little if any DPPD was expressed and
purified.

In contrast, when the DPPD coding sequence was inserted 3' to the Ral2 
sequence in an expression vector and transformed into E. coli, a large amount of
Ral2-DPPD fusion protein was produced. The nucleotide sequence (SEQ ID NO:5) and amino
acid sequence (SEQ ID NO:6) of Ral2-DPPD are disclosed in Figure 3. The
immunogenicity of DPPD was maintained as evidenced by the ability of antiserum to
react with the purified protein in immunoblotting analysis. In addition, three other
proteins of eukaryotic or prokaryotic origin (see Figures 4-6) were also successfully
expressed by the Ral2 fusion constructs. Thus, the Ral2 coding sequence is useful as a
fusion partner in an expression construct to facilitate the expression of a heterologous
sequence.

EXAMPLE 2: Short Ral2 Polypeptide (SEQ ID NO:17) as a Fusion Partner
In this example, a Ral2 polypeptide comprising amino acids 1-30 of SEQ
ID NO:4 was used as a fusion partner to link with the full length human mammaglobin
gene. This short form of Ral2 polypeptide has the amino acid sequence shown in SEQ
ID NO:17, and is referred to herein as "Ral2(short)".

As shown in Figure 9, the 3' end of the Ral2(short) sequence is fused to
the full length human mammaglobin gene. Specifically, the human mammaglobin gene
was amplified by standard PCR methods using the following oligonucleotide primers: the
5' primer, Hind III site: 5'-gcgaagcttATGAAGTTGCTGATGGTCCTCATGC-3' (SEQ
ID NO:19); the 3' primer, Xho I site:
5'-cggctcgagTTAAAATAAATCACAAAGACTGCTGTC-3' (SEQ ID NO:20). The 5'
Hind III and 3' Xho I sites were added to assist subcloning into a vector. The N-terminal
end of the fusion construct was engineered to code for six His-tag residues immediately
following the Met to facilitate purification protocols. The expression of the fusion
construct was accomplished following transformation into E. coli using procedures
similar to those described in Example 1. Compared to a construct without a Ral2(short)
sequence, the fusion construct with a Ral2(short) sequence substantially increased the
expression of the fusion Ral2(short)-mammaglobin protein.

EXAMPLE 3: Longer Ral2 Polypeptide (SEQ ID NO:18) as a Fusion Partner
In this example, a Ral2 polypeptide comprising amino acids 1-128 of SEQ
ID NO:4 was used as a fusion partner to link with the full length human mammaglobin
gene. This long form of Ral2 polypeptide has the amino acid sequence shown in SEQ ID
NO:18, and is referred to herein as "Ral2(long)". Cloning and expression procedures
similar to those described in Example 2 were used. Compared to a construct without a
Ral2(long) sequence, the fusion construct with a Ral2(long) sequence substantially
increased the expression of the fusion Ral2(long)-mammaglobin protein.

The present invention is not to be limited in scope by the exemplified 
embodiments which are intended as illustrations of aspects of the invention, and any
clones, nucleotide or amino acid sequences which are functionally equivalent are within
the scope of the invention. Indeed, various modifications of the invention in addition to
those described herein will become apparent to those skilled in the art from the foregoing
description and accompanying drawings. Such modifications are intended to fall within
the scope of the appended claims. It is also to be understood that all base pair sizes given
for nucleotides are approximate and are used for purposes of description.

All publications cited herein are incorporated by reference in their entirety.

WHAT IS CLAIMED IS:

1. A recombinant nucleic acid molecule that encodes a fusion
polypeptide, the recombinant nucleic acid molecule comprising a Ral2 polynucleotide
sequence and a heterologous polynucleotide sequence, wherein the Ral2 polynucleotide
sequence hybridizes to SEQ ID NO:3 under stringent conditions.

2. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence is located 5' to the heterologous
polynucleotide sequence.

3. The recombinant nucleic acid molecule according to claim 1, the
recombinant nucleic acid molecule further comprising a polynucleotide sequence that
encodes a linker peptide between the Ral2 polynucleotide sequence and the heterologous
polynucleotide sequence.

4. The recombinant nucleic acid molecule according to claim 3,
wherein the linker peptide comprises a cleavage site.

5. The recombinant nucleic acid molecule according to claim 1,
wherein the fusion polypeptide further comprises an affinity tag which is linked to the
fusion polypeptide.

6. The recombinant nucleic acid molecule according to claim 1,
wherein the heterologous nucleic acid sequence encodes a DPPD, a WT1, a
mammaglobin, or a H9-32A polypeptide.

7. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence comprises at least about 30 nucleotides.

8. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence comprises at least about 60 nucleotides.

9. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence comprises at least about 100 nucleotides.

10. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence encodes a Ral2 polypeptide as shown in SEQ
ID NO:17.

11. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence encodes a Ral2 polypeptide as shown in SEQ
ID NO:18.

12. The recombinant nucleic acid molecule according to claim 1,
wherein the Ral2 polynucleotide sequence is as shown in SEQ ID NO:3.

13. The recombinant nucleic acid according to claim 1, wherein the
Ral2 polynucleotide sequence encodes a Ral2 polypeptide as shown in SEQ ID NO:4.

14. An expression vector comprising a promoter operably linked to a
recombinant nucleic acid molecule according to claim 1.

15. A host cell transformed or transfected with an expression vector
according to claim 14.

16. The host cell according to claim 15, wherein the host cell is E. coli.

17. A fusion polypeptide comprising a Ra12 polypeptide and a heterologous 
polypeptide, wherein the Ra12 polypeptide is encoded by a Ra12 polynucleotide sequence 
that hybridizes to SEQ ID NO:3 under stringent hybridization conditions. 

18. The fusion polypeptide according to claim 17, wherein the Ra12 polypeptide 
comprises at least about 10 amino acids. 

19. The fusion polypeptide according to claim 17, wherein the Ra12 polypeptide 
comprises at least about 30 amino acids. 

20. The fusion polypeptide according to claim 17, wherein the Ra12 polypeptide 
comprises at least about 100 amino acids. 

21. The fusion polypeptide according to claim 17, wherein the Ra12 polypeptide 
has a sequence as shown in SEQ ID NO:4. 

22. The fusion polypeptide according to claim 17, wherein the Ra12 polypeptide 
has a sequence as shown in SEQ ID NO:17. 

23. The fusion polypeptide according to claim 17, wherein the Ra12 polypeptide 
has a sequence as shown in SEQ ID NO:18. 

24. The fusion polypeptide of claim 17, the fusion polypeptide further comprising 
a linker peptide between the Ra12 polypeptide and the heterologous polypeptide. 

25. The fusion polypeptide of claim 17, wherein the fusion polypeptide further 
comprises an affinity tag which is linked to the fusion polypeptide. 

26. The fusion polypeptide of claim 17, wherein the heterologous polypeptide is a 
DPPD, a WT1, a mammaglobin, or a H9-32A. 



27. A method of producing a fusion polypeptide, the method comprising expressing 
in a host cell a recombinant nucleic acid molecule that encodes a fusion polypeptide, the 
fusion polypeptide comprising a Ra12 polypeptide and a heterologous polypeptide, 
wherein the Ra12 polypeptide is encoded by a Ra12 polynucleotide sequence that 
hybridizes to SEQ ID NO:3 under stringent conditions. 

28. The method according to claim 27, wherein the fusion polypeptide further 
comprises an affinity tag which is linked to the fusion polypeptide. 

29. The method according to claim 27, wherein the fusion polypeptide is purified 
from the host cell. 

30. The method according to claim 27, the method further comprising cleaving the 
fusion polypeptide between the Ra12 polypeptide and the heterologous polypeptide. 




31. The method according to claim 27, wherein the host cell is E. coli. 

METHODS OF USING A MYCOBACTERIUM TUBERCULOSIS CODING 
SEQUENCE TO FACILITATE STABLE AND HIGH YIELD EXPRESSION OF 
HETEROLOGOUS PROTEINS 

ABSTRACT OF THE DISCLOSURE 
The present invention relates generally to nucleic acid and amino acid 
sequences of a fusion polypeptide comprising a Mycobacterium tuberculosis polypeptide 
and a heterologous polypeptide of interest, expression vectors and host cells comprising such 
nucleic acids, and methods for producing such fusion polypeptides. In particular, the 
invention relates to materials and methods of using such an M. tuberculosis sequence as a fusion 
partner to facilitate the stable and high yield expression of recombinant heterologous 
polypeptides of both eukaryotic and prokaryotic origin. 

Figure 1 

[Sequence figure: a nucleotide sequence with its amino acid translation, beginning 
TAASDNFQLSQGGQGFAIPIGQAMA.] 



Figure 2 

Ra12-DPPD.MPO (1 > 702) Site and Sequence 
Enzymes : All 515 enzymes (No Filter) 
Settings : Circular, Certain Sites Only, Standard Genetic Code 

[Sequence figure: nucleotide sequence of the construct with its amino acid translation; 
labeled regions: Ra12, Enterokinase, DPPD.] 



Figure 3 

Ra12-WT33f.MPO (1 > 1745) Site and Sequence 
Enzymes : All 515 enzymes (No Filter) 
Settings : Circular, Certain Sites Only, Standard Genetic Code 

[Sequence figure: nucleotide sequence of the construct with its amino acid translation; 
labeled regions: Met(6xHis), Ra12, WT1.] 



Figure 4 

Ra12-WT33f.MPO (1 > 1745) Site and Sequence 

[Sequence figure, continued: nucleotide sequence of the construct with its amino acid 
translation; labeled region: WT1.] 

Figure 5 

Ra12-mamma.MPO (1 > 672) Site and Sequence 
Enzymes : All 515 enzymes (No Filter) 
Settings : Circular, Certain Sites Only, Standard Genetic Code 

[Sequence figure: nucleotide sequence of the construct with its amino acid translation; 
labeled region: Ra12.] 

Figure 6 

Ra12-H9-32A.MPO (1 > 2191) Site and Sequence 
Enzymes : All 515 enzymes (No Filter) 
Settings : Circular, Certain Sites Only, Standard Genetic Code 

[Sequence figure: nucleotide sequence of the construct with its amino acid translation; 
labeled regions: Ra12, MTB39, MTB32A (N-ter).] 

Figure 7 

Ra12(short) polypeptide (SEQ ID NO: 17) 
TAASDNFQLSQGGQGFAIPIGQAMAIAGQI 

Figure 8 

Ra12(long) polypeptide (SEQ ID NO: 18) 

TAASDNFQLSQGGQGFAIPIGQAMAIAGQIKLPTVHIGPTAFLGLGVVDNNGNGARV 
QRVVGSAPAASLGISTGDVITAVDGAPINSATAMADALNGHHPGDVISVTWQTKSG 
GTRTGNVTLAEGPPA 
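
The relationship between the two sequences above, and the nucleotide lengths needed to encode them, can be checked with a short Python sketch. It is illustrative only: the helper names are hypothetical, and the strings are transcribed from Figures 7 and 8 with the residue after "GFA" read as I, which is an assumption based on the translation shown in Figure 1.

```python
# Sequences transcribed from Figures 7 and 8. The residue after "GFA" is
# read as I, which is an assumption.
RA12_SHORT = "TAASDNFQLSQGGQGFAIPIGQAMAIAGQI"
RA12_LONG = (
    "TAASDNFQLSQGGQGFAIPIGQAMAIAGQIKLPTVHIGPTAFLGLGVVDNNGNGARV"
    "QRVVGSAPAASLGISTGDVITAVDGAPINSATAMADALNGHHPGDVISVTWQTKSG"
    "GTRTGNVTLAEGPPA"
)


def is_n_terminal_fragment(fragment: str, full: str) -> bool:
    """Return True if `fragment` matches the N-terminal end of `full`."""
    return full.startswith(fragment)


def coding_length(n_residues: int) -> int:
    """Nucleotides needed to encode n residues (three per codon, no stop codon)."""
    return 3 * n_residues


def encoded_residues(n_nucleotides: int) -> int:
    """Number of complete codons in an in-frame stretch of nucleotides."""
    return n_nucleotides // 3


if __name__ == "__main__":
    print(len(RA12_SHORT), coding_length(len(RA12_SHORT)))
    print(is_n_terminal_fragment(RA12_SHORT, RA12_LONG))
    print(encoded_residues(30))
```

Under these assumptions the short form corresponds to the N-terminal portion of the long form, and an in-frame stretch of 30 nucleotides corresponds to 10 codons.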

[Schematic of the fusion construct, N- to C-terminus: His tag (6 aa), Ra12(short) (30 aa), 
a 2 aa segment, human mammaglobin (full length).] 

are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code, and that such willful 
false statements may jeopardize the validity of the application or any patent issuing thereon. 



Signature of Inventor 1 
Yasir Skeiky 
Date 

Signature of Inventor 2 
Jeffrey Guderian 
Date 






